A Novel Feature Extraction Technique for Speaker Identification
نویسنده
چکیده
This paper presents a novel feature extraction approach for speaker identification when the speech is corrupted by additive noise. The environmental mismatch between training and testing data degrades the performance of speaker identification system. The performance degradation is primarily due to presence of background noise when try to match a given speaker to the set of known speakers in a database. Mel frequency cepstral coefficients (MFCCs) are perhaps the most widely used front ends in the state of the art speaker identification systems. One of the major issues with MFCCs is that they are very sensitive to additive noise. To overcome this bottleneck, a temporal filtering procedure on the autocorrelation sequence is proposed to minimize the effect of additive noise. The proposed feature is called Relative Autocorrelation Mel Frequency Cepstral Coefficients (A-MFCC) which is derived based on filtering the temporal trajectories of short time one sided autocorrelation sequence. This filtering process minimizes the effect of additive noise. No prior knowledge of noise characteristics is required. The additive noise can be a colored noise. For speaker identification, Hindi database was constructed from the speech samples of each known speaker. Feature vectors (MFCCs and A-MFCCs) were extracted from the samples by short-term spectral analysis, and processed further by vector quantization for locating the clusters in the feature space. Experimental results indicated that A-MFCCs significantly improved the performance of speaker identification system in noisy environment.
منابع مشابه
Voice biometric feature using Gammatone filterbank and ICA
Voice biometric feature extraction is the core task in developing any speaker identification system. This paper proposes a robust feature extraction technique for the purpose of speaker identification. The technique is based on processing monaural speech signal using human auditory system based Gammatone Filterbank (GTF) and Independent Component Analysis (ICA). The measures used to assess the ...
متن کاملGammatone auditory filterbank and independent component analysis for speaker identification
Feature extraction is the key procedure when aiming at robust speaker identification. The most commonly used feature extraction techniques work successfully only in clean or matched environments. Accurate speaker identification is made difficult due to a number of factors, with handset/channel mismatch and environmental noise being the most prominent. This paper presents a novel technique which...
متن کاملSpeaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech: Speech Understanding / Interaction
Log area ratio coefficients (LAR) derived from linear prediction coefficients (LPC) is a well known feature extraction technique used in speech applications. This paper presents a novel way to use the LAR feature in a speaker identification system. Here, instead of using the mel frequency cepstral coefficients (MFCC), the LAR feature is used in a Gaussian mixture model (GMM) based speaker ident...
متن کاملText Independent Speaker Identification with Finite Multivariate Generalized Gaussian Mixture Model with Distant Microphone Speech
An effective and efficient speaker Identification (SI) system requires a robust feature extraction module followed by a speaker modeling scheme for generalized representation of these features. In recent, years Speaker Identification has seen significant advancement, but improvements have tended to be bench marked on the near field speech, ignoring the more realistic setting of far field instru...
متن کاملMFCC and its applications in speaker recognition
Speech processing is emerged as one of the important application area of digital signal processing. Various fields for research in speech processing are speech recognition, speaker recognition, speech synthesis, speech coding etc. The objective of automatic speaker recognition is to extract, characterize and recognize the information about speaker identity. Feature extraction is the first step ...
متن کامل